Abstract

In 1986, Victor Miller described an algorithm for computing the Weil pairing in an unpublished
manuscript. This algorithm has since become the core of all pairing-based cryptosystems.

Many improvements of the algorithm have been presented. Most of them involve choosing
elliptic curves of a special form in order to exploit a possible twist during Tate pairing computation.
Other improvements reduce the number of iterations in Miller's algorithm.

For the generic case, Blake, Murty and Xu proposed three refinements to Miller's algorithm
over Weierstrass curves. Although their refinements, which only reduce the total number of vertical
lines in Miller's algorithm, are not as efficient as other optimizations, they
can be applied to computing both the Weil and Tate pairings on all pairing-friendly elliptic
curves. In this paper we extend the Blake-Murty-Xu method and show how to
eliminate all vertical lines in Miller's algorithm during Weil/Tate pairing computation on
general elliptic curves. Experimental results show that our algorithm is about 25% faster than
the original Miller's algorithm.

1 Introduction

In recent years, the Weil/Tate pairings and their variants have become extremely useful in cryptography.
The first notable application of pairings to cryptology was the work of Menezes, Okamoto
and Vanstone [1], who showed in 1991 that, thanks to the Weil pairing, the discrete logarithm problem
on a supersingular elliptic curve can be reduced to the discrete logarithm problem in a finite field.
Frey and Rück [2] also considered this situation using the Tate pairing. However, the application of
pairings to constructing cryptographic protocols only attracted attention after Joux's seminal
paper describing a one-round three-party Diffie-Hellman key exchange protocol [3] in 2000. Since then,
the use of cryptosystems based on pairings has had huge success, with notable breakthroughs
such as the first practical Identity-Based Encryption (IBE) scheme [4] and the short signature scheme [5]
from the Weil pairing.

Efficient algorithms for Weil/Tate pairing computation thus play a very important role in
pairing-based cryptography. The best known method for computing Weil/Tate pairings is based on
Miller's algorithm [6] for rational functions from scalar multiplications of divisors. The Weil pairing
requires two Miller loops, while the Tate pairing requires only one application of the Miller loop
and a final exponentiation.

Consequently, many of the proposed improvements to pairing computation build on Miller's algorithm
in some manner. Barreto et al. [7] pointed out that we can ignore all terms that are contained in a proper
subfield of $\mathbb{F}_{q^k}$ during the computation of the Tate pairing when the chosen elliptic curves have
even embedding degree¹. Another approach to improving the algorithm is to reduce the Miller-loop
length by introducing variants of the Tate pairing such as the Eta pairing [8], the Ate pairing [9, 10], and
optimal pairings [11, 12].

1 A subgroup $G$ of the group of points of an elliptic curve $E(\mathbb{F}_q)$ is said to have embedding degree $k$ if its order $n$
divides $q^k - 1$, but does not divide $q^i - 1$ for all $0 < i < k$.

For a more generic approach, Blake, Murty and Xu [13] proposed three refinements to Miller's algorithm.
Their refinements reduce the total number of vertical lines in Miller's algorithm
thanks to an elegant observation involving the conjugate of a linear function $h(x, y) = k(x - a) + b - y$.²
Although this approach is not as efficient as that of Barreto et al. for Tate pairing
computation, it can be applied to computing both the Weil and Tate pairings on any pairing-friendly
elliptic curve.

Recently, Boxall et al. [14] presented a variant of Miller's algorithm based on a variant of Miller's
formula. Like the approach of Blake et al., their algorithm can also be applied to general
elliptic curves.

In this paper we extend the Blake-Murty-Xu method and show how to eliminate all vertical
lines in Miller's algorithm. Our algorithm is generically faster than the original Miller's algorithm
and its refinements [13, 15] for all pairing-friendly curves with any embedding degree. Like previous
refinements, our algorithm does not eliminate denominators, but it improves the performance of
both Weil and Tate pairing computation on general pairing-friendly elliptic curves. Our algorithm
is of particular interest for computing Ate-style pairings on elliptic curves with small embedding
degree $k$, and in situations where denominator elimination using a twist is not possible (for example
on curves with embedding degree $k$ not of the form $2^i 3^j$, where $i \geq 1$, $j \geq 0$).

We also study a modification of our algorithm which can eliminate denominators
when computing the Tate pairing on elliptic curves with even embedding degree. The efficiency of this
modified algorithm is thus comparable to that of Barreto et al. [7].

The rest of the paper is organized as follows. We briefly recall the definitions of the Weil/Tate
pairings, Miller's algorithm and the Blake-Murty-Xu method in Section 2. Section 3 presents
our improvements to the original Miller's algorithm for general elliptic curves. Section 4 analyzes
the theoretical efficiency of our algorithm and compares it with previous improvements. Section 5
discusses a modification without denominators that is applicable when the embedding degree $k$ is even.
Section 6 gives some experimental results. The conclusion and open problems are given in
Section 7.

2 Preliminaries 

In this section, we give a brief summary of the mathematical background and the definitions
of the Weil/Tate pairings. We then review Miller's algorithm for Weil/Tate pairing computation.
Finally, we briefly recall the Blake-Murty-Xu method for reducing vertical lines in Miller's
algorithm.

2.1 Divisors and Bilinear Pairings 

Let $K = \mathbb{F}_q = \mathbb{F}_{p^m}$ be a finite field of characteristic $p$ with $q$ elements, where $p > 3$ is a prime
number. An elliptic curve $E$ defined over $K$ in short Weierstrass form is the set of solutions $(x, y)$
to the following equation:

$$E : y^2 = x^3 + ax + b,$$

together with an extra point $O$, called the point at infinity of $E$, where $a, b \in \mathbb{F}_q$ are such
that the discriminant $\Delta = 4a^3 + 27b^2$ is non-zero.

2 The conjugate of $h$, denoted $\bar{h}(x, y)$, is $\bar{h}(x, y) = k(x - a) + b + y + a_1 x + a_3$, where $a_1, a_3$ are
parameters of an elliptic curve in Weierstrass form [13].
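As a concrete illustration of the definition above, the following minimal Python sketch checks the discriminant condition and implements the affine chord-and-tangent group law for a curve in short Weierstrass form over a prime field. The toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, the point $(5, 1)$ and all function names are assumptions chosen for illustration; they are not taken from this paper.

```python
# Toy example (assumed parameters, not from the paper):
# E : y^2 = x^3 + 2x + 2 over F_17; None stands for the point at infinity O.
P_MOD, A, B = 17, 2, 2


def is_nonsingular(a, b, p):
    """The discriminant 4a^3 + 27b^2 must be non-zero in F_p."""
    return (4 * a**3 + 27 * b**2) % p != 0


def on_curve(pt, a=A, b=B, p=P_MOD):
    """True if pt is O or satisfies y^2 = x^3 + ax + b."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x**3 + a * x + b)) % p == 0


def add(p1, p2, a=A, p=P_MOD):
    """Chord-and-tangent addition of two affine points (or O)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # p2 = -p1, so the sum is O
    if p1 == p2:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p  # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def mul(n, pt):
    """Double-and-add scalar multiplication [n]pt for n >= 0."""
    acc = None
    while n:
        if n & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        n >>= 1
    return acc
```

For instance, `add((5, 1), (5, 1))` gives `(6, 3)`, and the point `(5, 1)` has order 19 on this toy curve. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.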

A divisor is an element of the free abelian group $\mathrm{Div}(E)$ generated by the points of $E$. Given a
divisor $D = \sum_{P \in E} n_P (P)$, where $n_P \in \mathbb{Z}$ and only a finite number of the integers $n_P$ are nonzero,
the degree of $D$, denoted $\deg D$, is the integer $\sum_{P \in E} n_P$, and the order of $D$ at the point $P$, denoted
$\mathrm{ord}_P(D)$, is the integer $n_P$. The support of $D$ is the set of points $P$ such that $n_P \neq 0$.

Let $f \in K(E)$ be a nonzero rational function. The divisor of $f$ is the finite formal sum

$$\mathrm{div}(f) = \sum_{P \in E} \mathrm{ord}_P(f)\,(P),$$

where $\mathrm{ord}_P(f)$ is the order of the zero or pole of $f$ at $P$, that is, $\mathrm{ord}_P(f) > 0$ if $P$ is a zero of $f$,
$\mathrm{ord}_P(f) < 0$ if $P$ is a pole of $f$, and $\mathrm{ord}_P(f) = 0$ otherwise. It follows from the definition that
$\mathrm{div}(fg) = \mathrm{div}(f) + \mathrm{div}(g)$ and $\mathrm{div}(f/g) = \mathrm{div}(f) - \mathrm{div}(g)$ for any two nonzero rational functions $f$
and $g$ defined on $E$.

Let $\mathrm{Div}^0(E)$ be the subgroup of $\mathrm{Div}(E)$ consisting of divisors of degree 0. It turns out that
$\mathrm{div}(f)$ has degree 0 (i.e. $\mathrm{div}(f) \in \mathrm{Div}^0(E)$). A divisor $D$ is
called principal if $D = \mathrm{div}(f)$ for some function $f$. It is known that a divisor $D = \sum_{P \in E} n_P (P)$
is principal if and only if the degree of $D$ is zero and $\sum_{P \in E} n_P P = O$. Two divisors $D_1$ and $D_2$
in $\mathrm{Div}^0(E)$ are equivalent, denoted $D_1 \sim D_2$, if and only if their difference $D_1 - D_2$ is a principal
divisor³.

The key to the definition of pairings is the evaluation of rational functions at divisors. For any
function $f$ and any divisor $D = \sum_{P \in E} n_P (P)$ of degree 0, we define $f(D) = \prod_{P \in E} f(P)^{n_P}$. Let $r$
be an integer coprime to the characteristic $p$ of $K$, and let $P, Q \in E[r]$, where $E[r]$ is the set of points
whose order divides $r$. Let $D_P, D_Q \in \mathrm{Div}^0(E)$ be two divisors which are equivalent to $(P) - (O)$ and $(Q) - (O)$,
respectively, and such that $D_P$ and $D_Q$ have disjoint supports. As $rD_P$ and $rD_Q$ are principal,
there exist functions $f_P, f_Q$ such that $\mathrm{div}(f_P) = rD_P$ and $\mathrm{div}(f_Q) = rD_Q$. Then the Weil
pairing $\omega : E[r] \times E[r] \to \mathbb{F}_{q^k}^*$ is defined as

$$\omega(P, Q) = \frac{f_P(D_Q)}{f_Q(D_P)},$$

where $k$ is the embedding degree of $E(\mathbb{F}_q)$.
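The embedding degree used here is simply the multiplicative order of $q$ modulo the subgroup order (see footnote 1). A minimal Python sketch of that computation follows; the function name, the search bound `max_k` and the toy values $q = 17$, $n = 19$ are assumptions made for illustration only.

```python
def embedding_degree(q, n, max_k=1000):
    """Smallest k >= 1 with n | q^k - 1, i.e. the multiplicative order of q mod n.

    Returns None if no such k <= max_k exists (for example when gcd(q, n) != 1).
    """
    acc = 1
    for k in range(1, max_k + 1):
        acc = acc * q % n          # acc = q^k mod n
        if acc == 1:
            return k
    return None
```

For example, a subgroup of order 19 on a curve over $\mathbb{F}_{17}$ has embedding degree 9, since 19 divides $17^9 - 1$ but none of $17^i - 1$ for $0 < i < 9$.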

The Tate pairing is also defined in terms of $f_P(D_Q)$. Let $P \in E(\mathbb{F}_q)[r]$ and $Q \in E(\mathbb{F}_{q^k})[r]$ be
linearly independent points. Then the (reduced) Tate pairing $\tau : E(\mathbb{F}_q)[r] \times E(\mathbb{F}_{q^k})[r] \to \mathbb{F}_{q^k}^*$ is
defined as

$$\tau(P, Q) = f_P(D_Q)^{\frac{q^k - 1}{r}}.$$

The Weil/Tate pairings satisfy the following properties: bilinearity, non-degeneracy and compatibility with
isogenies.



The twist of a curve. Let $d$ be a factor of $k$. An elliptic curve $E'$ over $\mathbb{F}_{q^{k/d}}$ is called a twist of
degree $d$ of $E$ if there exists an isomorphism $\psi : E' \to E$ defined over $\mathbb{F}_{q^k}$.

A twist of $E$ is given by $E' : y^2 = x^3 + a\beta^4 x + b\beta^6$ for some $\beta \in \mathbb{F}_{q^k}^*$. The isomorphism between
$E'$ and $E$ is $\psi : E' \to E : (x', y') \mapsto (x'/\beta^2, y'/\beta^3)$.

3 We refer the reader to [16] for more details about divisors and rational functions.

2.2 Miller's Algorithm



Pairings over (hyper)elliptic curves are computed using the algorithm proposed by Miller [6].
The main part of Miller's algorithm is constructing the rational function $f_{r,P}$ and evaluating
$f_{r,P}(Q)$, where $\mathrm{div}(f_{r,P}) = r(P) - (rP) - [r - 1](O)$, for points $P$ and $Q$. Let $G_{iP,jP}$ be a rational
function with

$$\mathrm{div}(G_{iP,jP}) = (iP) + (jP) - ([i + j]P) - (O).$$

Miller's algorithm is based on the following relation, the so-called Miller's formula,
which is proved by considering divisors:

$$f_{i+j,P} = f_{i,P}\, f_{j,P}\, G_{iP,jP}.$$

Let $L_{iP,jP}$ be the equation of the line passing through $iP$ and $jP$ (or the equation of the tangent
line to the curve if $i = j$). Let $V_{iP+jP}$ be the equation of the vertical line passing through $(iP + jP)$
and $-(iP + jP)$. In the case of elliptic curves,

$$\mathrm{div}(L_{iP,jP}) = (iP) + (jP) + (-[i + j]P) - 3(O), \quad \text{and}$$

$$\mathrm{div}(V_{[i+j]P}) = ([i + j]P) + (-[i + j]P) - 2(O).$$

Thus, we have

$$G_{iP,jP} = \frac{L_{iP,jP}}{V_{(i+j)P}}.$$

In other words, in the case of elliptic curves, $G_{iP,jP}$ is the line passing through the points $iP$ and
$jP$ divided by the vertical line passing through the point $[i + j]P$.

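Notice that $\mathrm{div}(f_0) = \mathrm{div}(f_1) = 0$, so that $f_0 = f_1 = 1$. Let the binary representation of $r$ be
$r = \sum_{i=0}^{t} b_i 2^i$, where $b_i \in \{0, 1\}$. Using the double-and-add method, Miller's algorithm is described
in Algorithm 1.

Input: $r = \sum_{i=0}^{t} b_i 2^i$ with $b_i \in \{0, 1\}$, $P, Q \in E[r]$;
Output: $f = f_r(Q)$;

$T \leftarrow P$, $f \leftarrow 1$;
for $i = t - 1$ to $0$ do
    $f \leftarrow f^2 \cdot \frac{L_{T,T}(Q)}{V_{2T}(Q)}$; $T \leftarrow 2T$;
    if $b_i = 1$ then
        $f \leftarrow f \cdot \frac{L_{T,P}(Q)}{V_{T+P}(Q)}$; $T \leftarrow T + P$;
    end
end
return $f$

Algorithm 1: Miller's Algorithm (P, Q, r)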
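The double-and-add loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it works over a toy prime field ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, an assumed curve), it takes the lines in the normalized forms $L(x, y) = y - y_1 - \lambda(x - x_1)$ and $V(x, y) = x - x_1$, and it assumes that no intermediate point is the point at infinity and that no line vanishes at $Q$. All function names are hypothetical.

```python
# Toy parameters (assumed for illustration): E : y^2 = x^3 + 2x + 2 over F_17.
P_MOD, A = 17, 2


def inv(x):
    """Modular inverse in F_p (raises ValueError if x is 0 mod p)."""
    return pow(x % P_MOD, -1, P_MOD)


def slope(p1, p2):
    """Slope of the chord through p1 and p2 (tangent if p1 == p2)."""
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        return (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD
    return (y2 - y1) * inv(x2 - x1) % P_MOD


def add(p1, p2):
    """Affine addition; assumes p1 != -p2, so O never appears."""
    lam = slope(p1, p2)
    x3 = (lam * lam - p1[0] - p2[0]) % P_MOD
    return (x3, (lam * (p1[0] - x3) - p1[1]) % P_MOD)


def line(p1, p2, q):
    """L_{p1,p2}(q) = y_q - y_1 - lambda * (x_q - x_1)."""
    lam = slope(p1, p2)
    return (q[1] - p1[1] - lam * (q[0] - p1[0])) % P_MOD


def vertical(pt, q):
    """V_pt(q) = x_q - x_pt."""
    return (q[0] - pt[0]) % P_MOD


def miller(p, q, r):
    """f_{r,P}(Q) following Algorithm 1, for r >= 1."""
    t, f = p, 1
    for bit in bin(r)[3:]:  # bits b_{t-1}, ..., b_0 below the leading one
        two_t = add(t, t)
        f = f * f * line(t, t, q) * inv(vertical(two_t, q)) % P_MOD
        t = two_t
        if bit == "1":
            t_plus_p = add(t, p)
            f = f * line(t, p, q) * inv(vertical(t_plus_p, q)) % P_MOD
            t = t_plus_p
    return f
```

With $P = (5, 1)$ and $Q = (16, 13)$ on the toy curve, `miller(P, Q, 2)` evaluates $L_{P,P}(Q)/V_{2P}(Q)$. A real pairing computation would take $Q$ in an extension field and handle the final step where $rP = O$; both are omitted here for brevity.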
2.3 Blake-Murty-Xu's method 

Blake et al. achieved three refinements to Miller's algorithm in [13] thanks to the following
observation:

Lemma 2.1 (Lemma 1 of [13]). If the line $h(x, y) = 0$ intersects $E$ at the points $P = (a, b)$,
$Q = (c, d)$ and $-(P + Q)$, with $P + Q = (\alpha, \beta)$, then

$$h(x, y)\,\bar{h}(x, y) = -(x - a)(x - c)(x - \alpha).$$

The proof is given in [13]. From this lemma, they derived the following equation:

$$\frac{L_{T,T}(Q)}{V_T^2(Q)\, V_{2T}(Q)} = -\frac{1}{L_{T,T}(-Q)}, \qquad (1)$$

where $L_{T,T}(-Q) = \bar{L}_{T,T}(Q)$.

Blake et al. apply this observation to Miller's algorithm in a simple way. At each iteration they delay
a vertical line ($V_{2T}$ in a doubling step or $V_{T+P}$ in an addition step) in the denominator
until the next iteration. Two vertical lines can thus be eliminated, as can be seen from Eq. (1).
For the case where most bits in the binary representation $r = \sum_{i=0}^{t} b_i 2^i$ are 1, the authors described the
algorithm given in Algorithm 2.



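Input: $r = \sum_{i=0}^{t} b_i 2^i$ with $b_i \in \{0, 1\}$, $P, Q \in E$ where $P$ has order $r$;
Output: $f = f_r(Q)$;

if $b_{t-1} = 0$ then
    $f \leftarrow L_{P,P}(Q)$; $T \leftarrow 2P$;
else
    $f \leftarrow \frac{L_{P,P}(Q)\, L_{2P,P}(Q)}{V_{2P}(Q)}$; $T \leftarrow 3P$;
end
for $i = t - 2$ to $0$ do
    if $b_i = 0$ then
        $f \leftarrow f^2 \cdot \frac{V_{2T}(Q)}{L_{T,T}(-Q)}$; $T \leftarrow 2T$;
    else
        $f \leftarrow f^2 \cdot \frac{L_{2T,P}(Q)}{L_{T,T}(-Q)}$; $T \leftarrow 2T + P$;
    end
end
return $f$

Algorithm 2: Improved Miller's Algorithm of Blake et al. (Algorithm 4 in [13])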
The algorithm of Blake et al. eliminates both vertical lines if the bit $b_i$ is 1. When the bit $b_i$
is zero, the computation cost is the same as in Miller's algorithm. Thus, the number of vertical
lines that remain to be computed equals the number of zero bits in the binary representation of $r$, that is, $l = t - H(r)$,
where $H(r)$ is the Hamming weight of $r$ and $t$ is the number of bits of $r$. This algorithm works well
when the Hamming weight $H(r)$ is high. For the case where $H(r)$ is low, the authors also presented a refinement
of Miller's algorithm in radix 4 ($r = \sum_{i=0}^{t} q_i 4^i$ with $q_i \in \{0, 1, 2, 3\}$). This refinement saves
both vertical lines when a pair of consecutive bits is "00" ($q_i = 0$). However, there still
remains one vertical line in the case $q_i \in \{1, 2\}$, and two vertical lines if $q_i = 3$.

Later, Liu et al. [15] presented a further refinement of Miller's algorithm based on the
Blake-Murty-Xu method. Their improvement requires an additional algorithm for segmenting the binary
representation of the subgroup order $r$ into 7 cases : (00)*, (00) J 0, (1)*, (01)*, 0(1)*, (1)*0 and
0(1)*0. Although the algorithm is complex, it eliminates more lines than the Blake-Murty-Xu
algorithm.



3 Our Improvement on Miller's Algorithm 

First, we present the following two lemmas, which are the same as Lemma 1 and Lemma 2 of [13] if we
replace $L_{iP,jP}(-Q)$ by $L_{-iP,-jP}(Q)$. However, unlike the proof of Lemma 1 in [13], which makes use
of a geometrical observation, our lemma is obtained by calculating divisors.

Lemma 3.1. Let $L_{iP,jP}$ be the equation of the line passing through $iP$ and $jP$, let $L_{-iP,-jP}$ be the equation
of the line passing through $-iP$ and $-jP$, and let $V_{iP+jP}$ be the equation of the vertical line passing
through $(iP + jP)$ and $-(iP + jP)$. Then,

$$L_{iP,jP}(Q)\, L_{-iP,-jP}(Q) = V_{iP}(Q)\, V_{jP}(Q)\, V_{[i+j]P}(Q). \qquad (2)$$

Proof. By calculating divisors, it is straightforward to see that:

$$\begin{aligned}
\mathrm{div}(L_{iP,jP}(Q)\, L_{-iP,-jP}(Q)) &= \mathrm{div}(L_{iP,jP}(Q)) + \mathrm{div}(L_{-iP,-jP}(Q)) \\
&= (iP) + (jP) + (-[i + j]P) - 3(O) \\
&\quad + (-iP) + (-jP) + ([i + j]P) - 3(O) \\
&= (iP) + (-iP) + (jP) + (-jP) \\
&\quad + (-[i + j]P) + ([i + j]P) - 6(O),
\end{aligned}$$

$$\begin{aligned}
\mathrm{div}(V_{iP}(Q)\, V_{jP}(Q)\, V_{[i+j]P}(Q)) &= \mathrm{div}(V_{iP}(Q)) + \mathrm{div}(V_{jP}(Q)) + \mathrm{div}(V_{[i+j]P}(Q)) \\
&= (iP) + (-iP) - 2(O) + (jP) + (-jP) \\
&\quad - 2(O) + ([i + j]P) + (-[i + j]P) - 2(O) \\
&= (iP) + (-iP) + (jP) + (-jP) \\
&\quad + (-[i + j]P) + ([i + j]P) - 6(O).
\end{aligned}$$

Both sides of Eq. (2) have the same divisor, so they agree up to a nonzero constant, which equals 1 when
the lines are taken in the normalized forms $y - y_1 - \lambda(x - x_1)$ and $x - x_1$. Thus, Eq. (2) holds. □
Lemma 3.2.

$$\frac{L_{T,T}(Q)}{V_T^2(Q)\, V_{2T}(Q)} = \frac{1}{L_{-T,-T}(Q)}. \qquad (3)$$

This lemma follows easily from Lemma 3.1.
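To make this step explicit, the following short derivation specializes Eq. (2) to $iP = jP = T$, so that $[i + j]P = 2T$, and assumes that none of the functions involved vanishes at $Q$:

```latex
% Lemma 3.1 with iP = jP = T:
L_{T,T}(Q)\, L_{-T,-T}(Q) = V_T(Q)\, V_T(Q)\, V_{2T}(Q) = V_T^2(Q)\, V_{2T}(Q)
% Divide both sides by V_T^2(Q) V_{2T}(Q) L_{-T,-T}(Q):
\quad\Longrightarrow\quad
\frac{L_{T,T}(Q)}{V_T^2(Q)\, V_{2T}(Q)} = \frac{1}{L_{-T,-T}(Q)}.
```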

In what follows, we write $L_{T,T}$ in place of $L_{T,T}(Q)$, and $V_T$ in place
of $V_T(Q)$.



3.1 Blake-Murty-Xu's Refinement 

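Suppose that the order $r$ of the subgroup has the binary representation $r = \sum_{i=0}^{t} b_i 2^i$. The rational
function $f_r$ can be written as follows:

$$f_r = \prod_{i=t}^{1} \left( \frac{L_{\lfloor r/2^i \rfloor P,\, \lfloor r/2^i \rfloor P}\; L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P}}{V_{2\lfloor r/2^i \rfloor P}\; V_{\lfloor r/2^{i-1} \rfloor P}} \right)^{2^{i-1}}. \qquad (4)$$

In this formula, if $b_{i-1} = 0$, then $L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P} = L_{2\lfloor r/2^i \rfloor P,\, O} = V_{\lfloor r/2^{i-1} \rfloor P}$. We also assume that
$V_{rP} = V_O = 1$.

In the case where most bits in the binary representation of $r$ are 1, Blake et al. gave the following
computation:

$$f_r = \left( \frac{L_{P,P}\, L_{2P,\, b_{t-1}P}}{V_{2P}} \right)^{2^{t-1}} \prod_{i=t-1}^{1} \left( \frac{L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P}}{L_{-\lfloor r/2^i \rfloor P,\, -\lfloor r/2^i \rfloor P}} \right)^{2^{i-1}}. \qquad (5)$$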



In this refinement, Blake et al. always delay one vertical line until the next step in order
to apply Lemma 2.1. This trick works well when the bit $b_i = 1$. However, when
$b_i$ is 0 (only one vertical line is dealt with in this step), they add a vertical line to the current
step. The number of vertical lines remaining to be calculated is thus equal to the number of zero bits.

3.2 Our Refinement 

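From the observations in [13, 15], and by combining them with the trick of Eisenträger, Lauter and Montgomery
[17], we present a modification to Miller's algorithm that can eliminate all vertical lines.

The function $f_r$ in Eq. (4) is a product running from term $t$ down to term 1. Let $f_r^{(i)}$ be the value
of $f_r$ at term $i$; we re-define the function $f_r$ as follows:

$$f_r^{(i)} = \begin{cases} 1 & \text{if } i > t, \\ \left( f_r^{(i+1)} \right)^2 g^{(i)} & \text{if } 1 \leq i \leq t, \end{cases} \qquad (6)$$

so that $f_r = f_r^{(1)}$, where the function $g^{(i)}$ is defined as follows:

$$g^{(i)} = \begin{cases} \dfrac{L_{\lfloor r/2^i \rfloor P,\, \lfloor r/2^i \rfloor P}\; L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P}}{V_{2\lfloor r/2^i \rfloor P}} & \text{if } m_{i+1} = 0, \\[2ex] \dfrac{L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P}}{L_{-\lfloor r/2^i \rfloor P,\, -\lfloor r/2^i \rfloor P}} & \text{if } m_{i+1} = 1. \end{cases} \qquad (7)$$

If $b_{i-1} = 0$, then $L_{2\lfloor r/2^i \rfloor P,\, b_{i-1}P} = L_{2\lfloor r/2^i \rfloor P,\, O} = V_{2\lfloor r/2^i \rfloor P}$. In this case, there are no more vertical
lines in Eq. (7). Otherwise, if $b_{i-1} = 1$, we will show in the following subsection how to apply the
Eisenträger-Lauter-Montgomery trick [17] to eliminate the vertical line $V_{2\lfloor r/2^i \rfloor P}$ in the quotient

$$\frac{L_{\lfloor r/2^i \rfloor P,\, \lfloor r/2^i \rfloor P}\; L_{2\lfloor r/2^i \rfloor P,\, P}}{V_{2\lfloor r/2^i \rfloor P}}.$$

In the above equation, $m_i$ is defined as follows:

$$m_i = \begin{cases} 0 & \text{if } i > t, \\ \neg m_{i+1} \text{ or } b_{i-1} & \text{if } 1 \leq i \leq t. \end{cases} \qquad (8)$$

Unlike the Blake-Murty-Xu refinement, we accept that in some steps no line may be delayed.
If $m_i = 1$, there is a line delayed until the next step; otherwise there is none. For $1 \leq i \leq t$, $m_i$
becomes 0 if and only if $m_{i+1} = 1$ and the bit $b_{i-1} = 0$.

3.3 Our Algorithm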



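We use a memory variable $m$ to record whether a vertical line is still delayed in the current
step. At each step, we apply Eq. (3) if $m = 1$. Without loss of generality, we assume
that $f_1 = 1$ and $V_{rP} = 1$.

The algorithm is described by the pseudocode in Algorithm 3. The four cases are mutually exclusive:
exactly one of them is executed for each bit, according to the values of $b_i$ and $m$ at the start of the iteration.

Input: $r = \sum_{i=0}^{t} b_i 2^i$, $b_i \in \{0, 1\}$.
Output: $f$

$T \leftarrow P$, $f \leftarrow 1$, $m \leftarrow 0$;
for $i = t - 1$ to $0$ do
    1 if $b_i = 0$ and $m = 0$ then
          $f \leftarrow f^2 \cdot L_{T,T}$; $T \leftarrow 2T$; $m \leftarrow 1$;
      end
    2 if $b_i = 0$ and $m = 1$ then
          $f \leftarrow f^2 \cdot \frac{1}{L_{-T,-T}}$; $T \leftarrow 2T$; $m \leftarrow 0$;
      end
    3 if $b_i = 1$ and $m = 1$ then
          $f \leftarrow f^2 \cdot \frac{L_{2T,P}}{L_{-T,-T}}$; $T \leftarrow 2T + P$; $m \leftarrow 1$;
      end
    4 if $b_i = 1$ and $m = 0$ then
          $f \leftarrow f^2 \cdot \frac{L_{T,T}\, L_{2T,P}}{V_{2T}}$; $T \leftarrow 2T + P$; $m \leftarrow 1$;
      end
end
return $f$

Algorithm 3: Improved Refinement of Miller's Algorithm for any Pairing-Friendly Elliptic
Curve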
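The state machine of Algorithm 3 can be checked against the classical Miller loop with a small Python sketch. The toy curve ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$), the normalization $L(x, y) = y - y_1 - \lambda(x - x_1)$, $V(x, y) = x - x_1$, and all function names are assumptions made for illustration. Because the toy values of $r$ are not the order of $P$, the sketch divides by the delayed line $V_{rP}(Q)$ at the end, whereas the paper takes $V_{rP} = 1$. The quotient in line 4 is evaluated directly here rather than through a parabola.

```python
# Toy parameters (assumed for illustration): E : y^2 = x^3 + 2x + 2 over F_17.
P_MOD, A = 17, 2


def inv(x):
    return pow(x % P_MOD, -1, P_MOD)


def neg(pt):
    return (pt[0], -pt[1] % P_MOD)


def slope(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        return (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD  # tangent
    return (y2 - y1) * inv(x2 - x1) % P_MOD             # chord


def add(p1, p2):
    """Affine addition; assumes p1 != -p2, so O never appears."""
    lam = slope(p1, p2)
    x3 = (lam * lam - p1[0] - p2[0]) % P_MOD
    return (x3, (lam * (p1[0] - x3) - p1[1]) % P_MOD)


def line(p1, p2, q):
    """L_{p1,p2}(q) = y_q - y_1 - lambda * (x_q - x_1)."""
    return (q[1] - p1[1] - slope(p1, p2) * (q[0] - p1[0])) % P_MOD


def vertical(pt, q):
    """V_pt(q) = x_q - x_pt."""
    return (q[0] - pt[0]) % P_MOD


def miller(p, q, r):
    """Classical Miller loop (Algorithm 1), used as the reference."""
    t, f = p, 1
    for bit in bin(r)[3:]:
        two_t = add(t, t)
        f = f * f * line(t, t, q) * inv(vertical(two_t, q)) % P_MOD
        t = two_t
        if bit == "1":
            nxt = add(t, p)
            f = f * line(t, p, q) * inv(vertical(nxt, q)) % P_MOD
            t = nxt
    return f


def improved(p, q, r):
    """Algorithm 3 sketch: m = 1 means a vertical line V_T is still owed."""
    t, f, m = p, 1, 0
    for bit in bin(r)[3:]:
        two_t = add(t, t)
        f2 = f * f % P_MOD
        if bit == "0" and m == 0:      # line 1
            f, t, m = f2 * line(t, t, q), two_t, 1
        elif bit == "0" and m == 1:    # line 2
            f, t, m = f2 * inv(line(neg(t), neg(t), q)), two_t, 0
        elif bit == "1" and m == 1:    # line 3
            f = f2 * line(two_t, p, q) * inv(line(neg(t), neg(t), q))
            t, m = add(two_t, p), 1
        else:                          # line 4
            f = f2 * line(t, t, q) * line(two_t, p, q) * inv(vertical(two_t, q))
            t, m = add(two_t, p), 1
        f %= P_MOD
    if m == 1:
        # Delayed V_{rP}(Q); this factor is 1 in the paper's setting rP = O.
        f = f * inv(vertical(t, q)) % P_MOD
    return f
```

The `if`/`elif` chain makes the mutual exclusivity of the four cases explicit. With $P = (5, 1)$ and $Q = (16, 13)$, the two functions agree for $r \in \{3, 5, 9, 11\}$, which together exercise all four branches.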



Remark: Like the original Miller's algorithm, our algorithm cannot avoid the divisions needed to update
$f$. But we can easily reduce them to one inversion at the end of the addition chain (at the cost of
one additional squaring at each step of the algorithm).

We can see that the algorithm eliminates all vertical lines except in the case of line 4 of
Algorithm 3. Now, we show how to use the trick of Eisenträger, Lauter and Montgomery [17] to
replace the quotient by a parabola equation.

Eisenträger-Lauter-Montgomery's trick. In [17], the authors gave a significant and useful method
for computing $f_{(2i+j),P}$ directly from $f_{i,P}$ and $f_{(i+j),P}$ instead of using the traditional double-and-add
method. They constructed a parabola through the points $iP$, $iP$, $jP$, $-2iP - jP$, whose formula can be
used to replace $\frac{L_{iP,jP}\, L_{[i+j]P,iP}}{V_{[i+j]P}}$, as follows.

Let $iP = (x_1, y_1)$, $iP + jP = (x_3, y_3)$ and $2iP + jP = (x_4, y_4)$, and let $\lambda_1$ and $\lambda_2$ be the slopes of the
lines $L_{iP,jP}$ and $L_{[i+j]P,iP}$, respectively. Then,

$$\frac{L_{iP,jP}\, L_{[i+j]P,iP}}{V_{[i+j]P}} = \frac{(y + y_3 - \lambda_1(x - x_3))(y - y_3 - \lambda_2(x - x_3))}{x - x_3}. \qquad (9)$$

We simplify the right-hand side of Eq. (9) by expanding it in powers of $x - x_3$, and obtain the
following parabola:

$$\begin{aligned}
&\frac{y^2 - y_3^2}{x - x_3} - \lambda_1(y - y_3) - \lambda_2(y + y_3) + \lambda_1\lambda_2(x - x_3) \\
&\quad = x^2 + x_3 x + x_3^2 + a_4 + \lambda_1\lambda_2(x - x_3) - \lambda_1(y - y_3) - \lambda_2(y + y_3) \\
&\quad = x^2 + (x_3 + \lambda_1\lambda_2)x - (\lambda_1 + \lambda_2)y + \text{constant} \\
&\quad = (x - x_1)(x + x_1 + x_3 + \lambda_1\lambda_2) - (\lambda_1 + \lambda_2)(y - y_1),
\end{aligned}$$

where $a_4 = a$ for the short Weierstrass form used in this paper.

Clearly, this substitution parabola needs less effort to evaluate at a point than the
quotient $\frac{L_{iP,jP}\, L_{[i+j]P,iP}}{V_{[i+j]P}}$ at that point. Additionally, the parabola does not reference $y_3$, so we can
save one multiplication when calculating $2T + P$ by using the double-add trick of Eisenträger et al.

Now, we apply the method of Eisenträger, Lauter and Montgomery to construct a parabola
replacing $\frac{L_{T,T}\, L_{2T,P}}{V_{2T}}$. Similarly, let $T = (x_1, y_1)$ and $2T = (x_3, y_3)$; then

$$\begin{aligned}
P_{T,P}(x, y) = \frac{L_{T,T}\, L_{2T,P}}{V_{2T}} &= \frac{(y + y_3 - \lambda_1(x - x_3))(y - y_3 - \lambda_2(x - x_3))}{x - x_3} \\
&= (x - x_1)(x + x_1 + x_3 + \lambda_1\lambda_2) - (\lambda_1 + \lambda_2)(y - y_1), \qquad (10)
\end{aligned}$$

where $\lambda_1$ is the slope of the line passing through $T$ twice and $-2T$, and $\lambda_2$ is the slope of the line passing
through $2T$, $P$ and $-2T - P$. The quotient has zeros at $T$ twice (i.e., tangent at $T$), at $P$ and at
$-2T - P$, and a pole of order 4 at $O$. By simplifying as above, we obtain a substitution parabola.
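As a sanity check of Eq. (10), the following Python sketch evaluates both the quotient $L_{T,T} L_{2T,P} / V_{2T}$ and the substitution parabola at a point $Q$ of a toy curve and returns the two values. The curve ($y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$), the sample points and the function name are assumptions for illustration; the identity relies on $T$ and $Q$ lying on the curve.

```python
# Toy parameters (assumed for illustration): E : y^2 = x^3 + 2x + 2 over F_17.
P_MOD, A = 17, 2


def inv(x):
    return pow(x % P_MOD, -1, P_MOD)


def parabola_and_quotient(t, p, q):
    """Return (P_{T,P}(Q), L_{T,T}(Q) * L_{2T,P}(Q) / V_{2T}(Q)) in F_p."""
    (x1, y1), (xp, yp), (xq, yq) = t, p, q
    lam1 = (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD    # tangent slope at T
    x3 = (lam1 * lam1 - 2 * x1) % P_MOD               # 2T = (x3, y3)
    y3 = (lam1 * (x1 - x3) - y1) % P_MOD
    lam2 = (yp - y3) * inv(xp - x3) % P_MOD           # slope through 2T and P
    # Quotient form: the tangent at T also passes through -2T = (x3, -y3).
    quotient = ((yq + y3 - lam1 * (xq - x3))
                * (yq - y3 - lam2 * (xq - x3))
                * inv(xq - x3)) % P_MOD
    # Parabola form: note that y3 does not appear in this expression.
    parabola = ((xq - x1) * (xq + x1 + x3 + lam1 * lam2)
                - (lam1 + lam2) * (yq - y1)) % P_MOD
    return parabola, quotient
```

For $T = P = (5, 1)$ and $Q = (16, 13)$ both values equal 12. The parabola needs no field inversion and no product of two line values, which is where the saving in line 4 of Algorithm 3 comes from.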

Table 1 shows that our algorithm is more efficient than the classical Miller's algorithm, as
we save a product in the full extension field at each doubling and each addition step. The following
section discusses this in more detail. In Section 5 we describe a version without denominators
that can be applied to computing the Tate pairing on elliptic curves with even embedding degree.



4 Efficiency comparison 

In this section we give a performance analysis of our algorithm and compare it with
the original Miller's algorithm [6], the Blake-Murty-Xu refinements [13], Barreto et al.'s
algorithm for computing the Tate pairing on curves with even embedding degree [7], and Lin et al.'s
algorithm for computing pairings on curves with embedding degree $k = 9$ [18].

The cost of the algorithms for pairing computation can be considered to consist of three parts:
the cost of updating the function $f$, the cost of updating the point $T$, and the cost of evaluating
rational functions at some point $Q$. Without special treatment, we consider that the cost of updating
$T$ and the cost of evaluating the rational functions $L_{T,T}$, $L_{2T,P}$ at the point $Q$ are the same for all
algorithms (the cost of evaluating $L_{-T,-T}$ at a point is the same as that of evaluating $L_{T,T}$ at that
point). Besides, the most costly operations in pairing computations are those that take place in the
full extension field $\mathbb{F}_{q^k}$. At high levels of security (i.e. $k$ large), the complexity of operations in $\mathbb{F}_{q^k}$
dominates the complexity of the operations that occur in the lower degree subfields.

Algorithm                                Doubling            Doubling and Addition
Algorithm 1 (Miller's algorithm [6])     2S + 2M = 3.6M      2S + 4M = 5.6M
Algorithm 2 (Algorithm 4 in [13])        2S + 2M = 3.6M      2S + 2M = 3.6M
Algorithm 5 (Algorithm 3 in [13])        2S + 1M = 2.6M      2S + 3M = 4.6M
Barreto et al.'s algorithm [7]           1S + 1M = 1.8M      1S + 2M = 2.8M
Lin et al.'s algorithm [18]              1S + 2M = 2.8M      1S + 4M = 4.8M
Algorithm 3                              2S + 1M = 2.6M      2S + 2M = 3.6M (line 3)
                                                             2S + 1M = 2.6M (line 4)

Table 1: Comparison of the cost of updating $f$ in the algorithms. "Doubling" is when the algorithms deal
with the bit "$b_i = 0$" and "Doubling and Addition" is when the algorithms deal with the bit "$b_i = 1$".

For the above reasons, we focus only on the cost of updating the function $f$, which is
generally carried out in the full extension field $\mathbb{F}_{q^k}$.

Let M, S and I denote the cost of one full extension field multiplication, one full extension field
squaring and one full extension field inversion, respectively, for updating $f$. In the following analysis,
the ratio of one full extension field squaring to one full extension field multiplication is set to S
= 0.8M, a commonly used value in the literature (see [14]). The cost of one full extension field
division, which consists of one full extension field inversion I and one full extension field multiplication
M, is generally several times that of one full extension field multiplication [19, 20]. To avoid
this, we manipulate the numerators and denominators separately, and perform one division at the
very end of the algorithm (at the cost of one additional full extension field squaring S for each bit
treated).

In [7], Barreto et al. pointed out that when the embedding degree $k$ is even, denominators
can be totally eliminated during Tate pairing computation. The authors observed that the point
$Q$ can be chosen so that its $x$-coordinate lies in a proper subfield; the value of the vertical line
$V_{T+P}(Q) = x_Q - x_{T+P}$ then lies in a proper subfield of $\mathbb{F}_{q^k}$. Thus the denominator becomes 1
when the final exponentiation is performed. Similarly, Lin et al. [18] proposed an algorithm that
can eliminate denominators during Tate pairing computation on curves with embedding degree
$k = 3^i$ that can employ a cubic twist. On the other hand, their algorithm needs one more full extension
field multiplication than that of Barreto et al. [7].

Table 1 compares the cost of updating $f$ in our algorithm (Algorithm 3) with
that in Miller's algorithm [6] (Algorithm 1), the Blake-Murty-Xu algorithms in [13] (Algorithm 2
and Algorithm 5, described in Appendix A), Barreto et al.'s algorithm [7] and Lin et al.'s algorithm [18].

From Table 1, for the generic case we can see that Algorithm 3 saves one full extension field
multiplication when the bit $b_i = 0$ compared with Algorithm 1 and Algorithm 2. It has the
same cost as Algorithm 5. When the bit $b_i = 1$, Algorithm 3 has the same cost as Algorithm 2,
but saves one and two full extension field multiplications in comparison to Algorithm 5 and
Algorithm 1, respectively. In total, our algorithm saves $\log(r) + H(r)$, $\log(r) - H(r)$ and $H(r)$
full extension field multiplications compared with the original Miller's algorithm, Algorithm 2 and
Algorithm 5, respectively. Here, $\log(r)$ and $H(r)$ denote the length in bits of the elliptic curve
group order and the Hamming weight of the group order $r$, respectively.

When the embedding degree $k$ gets large, the complexity of the operations occurring in the
full extension field $\mathbb{F}_{q^k}$ dominates the complexity of those occurring in $\mathbb{F}_q$; our
algorithm is then about 25% faster than the original Miller's algorithm.
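The per-bit costs of Table 1 can be turned into a small cost model. The Python sketch below counts full extension field squarings and multiplications spent on updating $f$ for a given $r$, for Algorithm 1 and for Algorithm 3; it tracks the flag $m$ to distinguish lines 3 and 4 of Algorithm 3. The function names and the string labels `"miller"` and `"alg3"` are hypothetical, and the model deliberately ignores the cost of updating $T$ and of evaluating the lines.

```python
def update_cost(r, algorithm):
    """Return (squarings, multiplications) in the full extension field
    spent on updating f, using the per-bit costs listed in Table 1."""
    squarings = mults = 0
    m = 0  # Algorithm 3 only: 1 if a vertical line is currently delayed
    for bit in bin(r)[3:]:  # bits below the leading one
        squarings += 2      # numerator and denominator are squared separately
        if algorithm == "miller":
            mults += 2 if bit == "0" else 4
        elif algorithm == "alg3":
            if bit == "0":
                mults += 1                    # lines 1 and 2
                m = 1 - m
            else:
                mults += 2 if m == 1 else 1   # line 3 versus line 4
                m = 1
        else:
            raise ValueError("unknown algorithm")
    return squarings, mults


def in_multiplications(cost, s_over_m=0.8):
    """Convert (S, M) counts to multiplications, with S = 0.8M by default."""
    squarings, mults = cost
    return squarings * s_over_m + mults
```

For example, for $r = 11$ (binary 1011) the model gives 6S + 10M for Algorithm 1 and 6S + 5M for Algorithm 3, a saving of five multiplications over the three processed bits.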

Our algorithm is also better than that of Lin et al. [18] in all cases. It trades one multiplication for
one squaring in the doubling step (2S + 1M instead of 1S + 2M), and only requires 2S + 2M instead of
1S + 4M for the doubling and addition step of their algorithm.

In comparison with Barreto et al.'s algorithm [7], our algorithm takes one more full extension
field squaring for each bit. However, as already mentioned, our approach is generic and can be
applied to any (pairing-friendly) elliptic curve.

The next section presents a modification of our algorithm which can be used for computing
pairings on elliptic curves with even embedding degree. We show that the efficiency of the modified
algorithm is comparable to that of Barreto et al.'s algorithm.

5 A modification for elliptic curves with even embedding degree 

Actual implementations are adapted to twisted elliptic curves, so that Miller's algorithm can be
implemented more efficiently. Indeed, as pointed out in [7], such curves admit an even twist, which
makes it possible to eliminate denominators and all irrelevant terms in proper subfields of $\mathbb{F}_{q^k}$. In the case of a
cubic twist, denominator elimination is also possible [18]. Another advantage of embedding degrees
of the form $2^i 3^j$, where $i \geq 1$, $j \geq 0$, is that the corresponding extensions of $\mathbb{F}_q$ can be written as
composite extensions of degree 2 or 3, which allows faster basic arithmetic operations [22].

In this section we construct a variant of Algorithm 3 for the case where $k$ is even.

Let $v = a + ib$ be a representation of an element of $\mathbb{F}_{q^k}$, where $a, b \in \mathbb{F}_{q^{k/2}}$, $\delta \in \mathbb{F}_{q^{k/2}}$ is a quadratic
non-residue and $i^2 = \delta$. The conjugate of $v$ over $\mathbb{F}_{q^{k/2}}$ is given by $\bar{v} = \overline{a + ib} = a - ib$. It follows
that, if $v \neq 0$, then

$$\frac{1}{v} = \frac{\bar{v}}{a^2 - \delta b^2},$$

where $a^2 - \delta b^2 \in \mathbb{F}_{q^{k/2}}$. Thus, in a situation where elements of $\mathbb{F}_{q^{k/2}}$ can be ignored, $\frac{1}{v}$ can be
replaced by $\bar{v}$, thereby saving an inversion in $\mathbb{F}_{q^k}$ [23].
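The identity above can be illustrated in the smallest possible setting. The Python sketch below models a quadratic extension $\mathbb{F}_{p^2} = \mathbb{F}_p(i)$ with $i^2 = \delta$; the toy values $p = 7$, $\delta = 3$ (a non-residue modulo 7) and the function names are assumptions for illustration, standing in for the tower $\mathbb{F}_{q^k}$ over $\mathbb{F}_{q^{k/2}}$.

```python
# Toy quadratic extension (assumed values): F_49 = F_7(i) with i^2 = 3.
P_MOD, DELTA = 7, 3


def mul(u, v):
    """(a + ib)(c + id) = (ac + delta*bd) + i(ad + bc)."""
    (a, b), (c, d) = u, v
    return ((a * c + DELTA * b * d) % P_MOD, (a * d + b * c) % P_MOD)


def conj(v):
    """Conjugate over the base field: a + ib -> a - ib."""
    return (v[0], -v[1] % P_MOD)


def norm(v):
    """v * conj(v) = a^2 - delta*b^2, an element of the base field."""
    a, b = v
    return (a * a - DELTA * b * b) % P_MOD


def inverse(v):
    """1/v = conj(v) / (a^2 - delta*b^2): only a base-field inversion is needed."""
    n_inv = pow(norm(v), -1, P_MOD)
    a, b = conj(v)
    return (a * n_inv % P_MOD, b * n_inv % P_MOD)
```

For $v = 2 + 5i$ the norm is 6 and `inverse((2, 5))` returns `(5, 5)`. Since $1/v$ and $\bar{v}$ differ only by the base-field factor `norm(v)`, replacing one by the other is harmless whenever such factors are removed by the final exponentiation.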

We exploit this fact in the following modification of the algorithm, where we replace the
denominator $L_{-T,-T}$ by its conjugate $\bar{L}_{-T,-T}$. The new algorithm is given in Algorithm 4.

Input: $r = \sum_{i=0}^{t} b_i 2^i$, $b_i \in \{0, 1\}$.
Output: $f$

$T \leftarrow P$, $f \leftarrow 1$, $m \leftarrow 0$;
for $i = t - 1$ to $0$ do
    1 if $b_i = 0$ and $m = 0$ then
          $f \leftarrow f^2 \cdot L_{T,T}$; $T \leftarrow 2T$; $m \leftarrow 1$;
      end
    2 if $b_i = 0$ and $m = 1$ then
          $f \leftarrow f^2 \cdot \bar{L}_{-T,-T}$; $T \leftarrow 2T$; $m \leftarrow 0$;
      end
    3 if $b_i = 1$ and $m = 1$ then
          $f \leftarrow f^2 \cdot L_{2T,P}\, \bar{L}_{-T,-T}$; $T \leftarrow 2T + P$; $m \leftarrow 1$;
      end
    4 if $b_i = 1$ and $m = 0$ then
          $f \leftarrow f^2 \cdot P_{T,P}(x, y)$; $T \leftarrow 2T + P$; $m \leftarrow 1$;
      end
end
return $f$

Algorithm 4: Improved Refinement of Miller's Algorithm for Even Twisted Curves during
Tate pairing computation

The factor $P_{T,P}(x, y)$ in line 4 of Algorithm 4 is the parabola equation
described in Eq. (10).

Table 2 compares the modified algorithm with the original Miller's algorithm
and Barreto et al.'s algorithm.

Algorithm                          Doubling            Doubling and Addition
Miller's algorithm [6]             2S + 2M = 3.6M      2S + 4M = 5.6M
Barreto et al.'s algorithm [7]     1S + 1M = 1.8M      1S + 2M = 2.8M
Algorithm 4                        1S + 1M = 1.8M      1S + 2M = 2.8M (line 3)
                                                       1S + 1M = 1.8M (line 4)

Table 2: Number of operations in $\mathbb{F}_{q^k}$ during Tate pairing computation in the case of curves with
even embedding degree

From Table 2 we can see that our algorithm needs no more effort to update the function $f$ than
Barreto et al.'s algorithm. When the complexity of operations in $\mathbb{F}_{q^k}$ dominates the complexity of
the operations that occur in the lower degree subfields, the total cost of Algorithm 4 is only about
60% of that of the original Miller's algorithm.

6 Experiments

We implemented our algorithms and ran some experiments on different elliptic curves at the 128-bit
security level. For this security level, one can choose elliptic curves with embedding degree
$6 \leq k \leq 10$ when $\rho \approx 2$, and $12 \leq k \leq 20$ when $\rho \approx 1$ (see [24, Table 1]). In our implementations,
we used curves with embedding degrees $k = 9$ [18], $k = 12$ [21], and $k = 18$ [25]. We
compared the performance of Algorithm 1, Algorithm 3 and the algorithm proposed in 2008 in
[18] when $k = 9$, while when $k = 12$ and $k = 18$, we compared the performance of Algorithm 1,
Algorithm 4 and the algorithm in [7].

k      Miller algorithm    Our algorithms    BLS algorithm [7]    LZZW algorithm [18]
9      0.0568 (s)          0.0404 (s)        -                    0.0517 (s)
12     0.0509 (s)          0.0278 (s)        0.0285 (s)           -
18     0.1164 (s)          0.0596 (s)        0.0613 (s)           -

Table 3: Timings

The implementations, which are based on the NTL library for number theory [26] and
the GNU Multi-Precision package (GMP) [27], do not use any optimization tricks. Computations
on 100 random inputs are performed only for the Miller function (without any final exponentiation) in
affine coordinates. Average timings were measured on an Intel(R) Core(TM)2 Duo CPU E8500 @
3.16GHz with 4 GB of RAM under the Ubuntu 10.10 32-bit operating system. The experimental results are
summarized in Table 3.

We do not apply any twist in the implementations. Thus, in the example with $k = 9$, the full
extension field $\mathbb{F}_{p^k}$ was generated as $\mathbb{F}_p[x]/(x^9 + x + 1)$, while when $k = 12$, the full extension field
$\mathbb{F}_{p^k}$ was generated as $\mathbb{F}_p[x]/(x^{12} + 5)$, and when $k = 18$, the full extension field $\mathbb{F}_{p^k}$ was generated as
$\mathbb{F}_p[x]/(x^{18} + x + 3)$.

The parameters of the elliptic curves used are as follows (long integers continue on the following line):

• For $k = 9$, the elliptic curve is defined by $E : y^2 = x^3 + 1$ over a finite field of 348 bits, and

    r = 1758592360244376049423345540022962797459
        272736402347141193268746504567484534417;
    p = 3061451105959572350992904218241517192718
        315802710560373001011629795786952195361
        19724392170588602764112177;
    ρ = 4/3;

• For $k = 12$, the elliptic curve is defined by $E : y^2 = x^3 + 5$ over a finite field of 254 bits, and

    r = 160305690344031282777566882874986495155
        10226217719936227669524443298095169537;
    p = 160305690344031282777566882874986495156
        36838101184337499778392980116222246913;
    ρ = 1;

• For $k = 18$, the elliptic curve is defined by $E : y^2 = x^3 + 19$ over a finite field of 335 bits, and

    r = 10786994225696144150491191871486839136
        9781354128945134119237266728176832001;
    p = 58709285320900073406925617811693805623
        56430404913482030243997510873476960788
        5279673307215755454161141;
    ρ = 4/3;



Table 3 shows that our refinement is about 25%-40% faster than the original Miller
algorithm. It is also about 20% faster than the algorithm in [18], which eliminates
denominators using a cubic twist, when the embedding degree $k = 9$. When $k$ is even, our algorithm is
comparable to the algorithm in [7]. Table 3 also shows that at the 128-bit security level, the
Barreto-Naehrig (BN) curve [21] over a prime field of size roughly 256 bits with embedding degree $k = 12$
is the best choice.

7 Conclusion and open problems 

In this paper we extended the Blake-Murty-Xu method to propose further refinements to Miller's
algorithm, which is at the heart of all pairing-based cryptosystems. Our algorithm can eliminate all
vertical lines in the original Miller's algorithm, and so it is generically more efficient than the
refinements of Blake-Murty-Xu [13] and that of Liu et al. [15]. We also proposed a variant that can
eliminate denominators, as in Barreto et al.'s algorithm, for computing the Tate pairing on even twisted
elliptic curves [7].

Our improvement works well for computing both the Weil and Tate pairings over any
pairing-friendly elliptic curve. In [11], the author introduced the concept of optimal pairings, which
can be computed with $\log_2 r / \varphi(k)$ basic Miller iterations. For example, using Theorem 2 of [12], it
should be possible to find an elliptic curve with a prime embedding degree minimizing the number
of iterations. We believe that there will be applications in pairing-based cryptography using elliptic
curves whose embedding degree is not of the form $2^i 3^j$. Further work is needed to clarify such
questions.

A Algorithm 3 in [13]

In radix-4 representation, Blake, Murty and Xu [13] presented a refinement of Miller's algorithm
which works as in Algorithm 5.

Input: $r = \sum_{i=0}^{t} q_i 4^i$, $q_i \in \{0, 1, 2, 3\}$, $P, Q \in E[r]$.
Output: $f$

$T \leftarrow P$, $f \leftarrow 1$;
if $q_t = 2$ then
    $f \leftarrow f^2 \cdot \frac{L_{P,P}(Q)}{V_{2P}(Q)}$; $T \leftarrow 2P$;
end
if $q_t = 3$ then
    $f \leftarrow f^3 \cdot \frac{L_{P,P}(Q)\, L_{2P,P}(Q)}{V_{2P}(Q)\, V_{3P}(Q)}$; $T \leftarrow 3P$;
end
for $i = t - 1$ to $0$ do
    if $q_i = 0$ then
        $f \leftarrow f^4 \cdot \frac{L_{T,T}^2(Q)}{L_{2T,2T}(-Q)}$; $T \leftarrow 4T$;
    end
    if $q_i = 1$ then
        $f \leftarrow f^4 \cdot \frac{L_{T,T}^2(Q)\, L_{4T,P}(Q)}{V_{4T+P}(Q)\, L_{2T,2T}(-Q)}$; $T \leftarrow 4T + P$;
    end
    if $q_i = 2$ then
        $f \leftarrow f^4 \cdot \frac{L_{T,T}^2(Q)\, L_{2T,P}^2(Q)}{V_{2T}^2(Q)\, L_{2T+P,2T+P}(-Q)}$; $T \leftarrow 4T + 2P$;
    end
    if $q_i = 3$ then
        $f \leftarrow f^4 \cdot \frac{L_{T,T}^2(Q)\, L_{2T,P}^2(Q)\, L_{4T+2P,P}(Q)}{V_{2T}^2(Q)\, L_{2T+P,2T+P}(-Q)\, V_{4T+3P}(Q)}$; $T \leftarrow 4T + 3P$;
    end
end
return $f$

Algorithm 5: Blake-Murty-Xu's Refinement on Miller's Algorithm in base 4